An active crawler for discovering geospatial Web services and their distribution pattern - A case study of OGC Web Map Service
Authors
Abstract
The increased popularity of standards for geospatial interoperability has led to an increasing number of geospatial Web services (GWSs), such as Web Map Services (WMSs), becoming publicly available on the Internet. However, finding these services quickly and precisely is still a challenge. Traditional methods collect the services through centralized registries, where services can be manually registered, but the metadata of the registered services cannot be updated in a timely manner. This paper addresses the above challenges by developing an effective crawler to discover and update the services through (1) proposing an accumulated term frequency (ATF)-based conditional probability model for prioritized crawling, (2) utilizing a concurrent multi-threading technique, and (3) adopting an automatic mechanism to update the metadata of identified services. Experiments show that the proposed crawler achieves good performance in both crawling efficiency and the coverage and liveliness of its results. In addition, an interesting finding regarding the distribution pattern of WMSs is discussed.
We expect this research to contribute to automatic GWS discovery over the large-scale and dynamic World Wide Web and to the promotion of operational, interoperable, distributed geospatial services.

1. Introduction

Advances in geospatial information acquisition methods have made it possible to collect huge amounts of geospatial information. In 2006 alone, NASA's Earth Observing System Data and Information System (EOSDIS) produced over 3 terabytes (TB) of Earth system science data on a daily basis (NASA 2007). Geospatial information is widely utilized in different applications, such as navigation (Rae-Dupree 2006), transportation (Peytchev and Claramunt 2001), urban planning (Stevens et al. 2007), and emergency response (Rauschert et al. 2002). However, the information is archived in various forms, and the geospatial applications, provided by different vendors, are highly heterogeneous in data representation, storage, and access (Paul and Ghosh 2006). The heterogeneity makes it difficult …
Similar resources
GeoWeb Crawler: An Extensible and Scalable Web Crawling Framework for Discovering Geospatial Web Resources
With the advance of World-Wide Web (WWW) technology, people can easily share content on the Web, including geospatial data and web services. Thus, “big geospatial data management” issues have started to attract attention. Among the big geospatial data issues, this research focuses on discovering distributed geospatial resources. As resources are scattered on the WWW, users cannot find resource...
Automatic Generation of Geospatial Metadata for Web Resources
Web resources that are not part of any Spatial Data Infrastructure can be an important source of information. However, the incorporation of Web resources within a Spatial Data Infrastructure requires a significant effort to create metadata. This work presents an extensible architecture for an automatic characterisation of Web resources and a strategy for assignation of their geographic scope. T...
Publication and Discovery of Semantically Annotated Geospatial Web Services
Environmental information and services have become a crucial asset in the creation of decision support systems. Unfortunately, these information and services are not usually exposed in an interoperable and standard way, limiting their reusability and impact in the community. Publishing and discovering geospatial information and services on the Web is therefore an important challenge in order to...
Interpolation of Precipitation Sensor Measurements using OGC Web Services
The standards developed by the Open Geospatial Consortium (OGC) provide a broad foundation for web-based geographical applications. One of these, the OGC Sensor Web Enablement (SWE), which comprises standards for the consumer- and provider-oriented sensor viewpoint, allows data requests of real-time sensor measurements and observations, abstracting from the inherent sensor particularities. The ...
Information Services for Grid/Web Service Oriented Architecture (SOA) Based Geospatial Applications
Geographical Information Systems (GIS) present a data-intensive environment for acquiring, processing and sharing geo-data among interested parties. In order to serve geographical information to users in such an environment, Service Oriented Architecture (SOA) principles have gained great importance. In SOA-based systems, Information Services support the discovery and handling of these geospatial s...
Journal title:
- International Journal of Geographical Information Science
Volume 24
Publication date: 2010